On Trivial Solution and High Correlation Problems in Deep Supervised Hashing

Authors

  • Yuchen Guo
  • Xin Zhao
  • Guiguang Ding
  • Jungong Han
Abstract

Deep supervised hashing (DSH), which combines binary learning and convolutional neural networks, has attracted considerable research interest and achieved promising performance for highly efficient image retrieval. In this paper, we show that the widely used loss functions, pair-wise loss and triplet loss, suffer from the trivial solution problem and usually lead to highly correlated bits in practice, limiting the performance of DSH. One important reason is that it is difficult to incorporate proper constraints into the loss functions under the mini-batch based optimization algorithm. To tackle these problems, we propose to adopt an ensemble learning strategy for deep model training. We find that this simple strategy effectively decorrelates different bits, making the hashcodes more informative. Moreover, it makes it very easy to parallelize the training and to support incremental model learning, both of which are very useful for real-world applications but usually ignored by existing DSH approaches. Experiments on benchmarks demonstrate that the proposed ensemble-based DSH can significantly improve the performance of DSH approaches.

∗This research was supported by the National Natural Science Foundation of China (Grant No. 61571269) and the Royal Society Newton Mobility Grant (IE150997). Yuchen Guo and Xin Zhao contributed equally. Corresponding author: Jungong Han. Copyright © 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Introduction

The number of images on the Internet has been growing rapidly in recent years, necessitating highly efficient indexing techniques to facilitate large-scale image retrieval. Recent work has demonstrated that hashing is a powerful technique for efficient and accurate image retrieval (Wang et al. 2016). In particular, hashing transforms real-valued image representations into binary hashcodes. Then, using extremely fast basic CPU operations such as bit XOR, the Hamming distance between hashcodes can be obtained at little time cost. In this way, linearly scanning the database is fast and the memory cost of storing the database is low. Suppose we have 1 billion images and each image is represented as a 128-bit binary sequence. It requires just 16 GB of memory to load all images' hashcodes, and computing the Hamming distance between a query image and all database images takes only a few seconds (Wang et al. 2015). Because of its outstanding efficiency and accuracy, hashing has been applied to many computer vision tasks, including not only image retrieval, but also large-scale clustering (Gong et al. 2015), classification (Mu et al. 2014), and re-identification (Zheng and Shao 2016).

Figure 1: The correlation matrix (absolute value) of hashcode bits. Panels: (a) Pair-wise, mAC=0.92; (b) Triplet, mAC=0.52. Different hashcode bits are highly correlated.

Inspired by the great success of convolutional neural networks (CNN) in many computer vision tasks (He et al. 2016), researchers have attempted to combine CNN with hashing (Lai et al. 2015; Liong et al. 2015; Liu et al. 2016; Xia et al. 2014). In particular, by slightly modifying the network structure of a CNN, especially the output layer, we can use similarity supervision to train the CNN as a very effective hashing model that takes the raw image as input and outputs the hashcode for that image. Based on the power of CNN, the deep supervised hashing (DSH) model can effectively exploit the semantic similarity structure of images and produce better hashcodes than non-deep hashing approaches. For example, Xia et al. (2014) have shown that a simple and straightforward DSH model can improve the mean Average Precision (mAP) over the state-of-the-art non-deep approaches by about 15 percentage points (from about 35% to about 50%) on CIFAR10 (Krizhevsky 2009). With more elaborate designs, the mAP reaches 60% or more (Liu et al. 2016).
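As a rough illustration of the efficiency argument in the introduction, the following Python sketch binarizes a feature vector, compares two hashcodes with XOR, and reproduces the 16 GB storage estimate. It also computes a mean absolute correlation between bits. All function names are hypothetical. Sign thresholding is an assumed binarization rule, and reading "mAC" as the mean absolute Pearson correlation over pairs of distinct bits is likewise an assumption, since the excerpt defines neither.

```python
from itertools import combinations
from math import sqrt


def binarize(features):
    """Map a real-valued vector to an integer hashcode (assumed rule: bit = 1 if value > 0)."""
    code = 0
    for value in features:
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming_distance(code_a, code_b):
    """Count differing bits: XOR marks the mismatches, then count the set bits."""
    return bin(code_a ^ code_b).count("1")


def storage_bytes(num_images, bits_per_code):
    """Bytes needed to hold every hashcode, packed 8 bits per byte."""
    return num_images * bits_per_code // 8


def mean_abs_correlation(codes, num_bits):
    """Mean absolute Pearson correlation over all pairs of distinct bit positions.

    Assumption: one plausible reading of "mAC"; the excerpt does not define it.
    Pairs involving a bit that never varies are skipped (correlation undefined).
    """
    # columns[k] holds bit k (most significant first) of every code.
    columns = [[(c >> (num_bits - 1 - k)) & 1 for c in codes] for k in range(num_bits)]
    n = len(codes)
    means = [sum(col) / n for col in columns]
    total, pairs = 0.0, 0
    for i, j in combinations(range(num_bits), 2):
        cov = sum((a - means[i]) * (b - means[j]) for a, b in zip(columns[i], columns[j]))
        var_i = sum((a - means[i]) ** 2 for a in columns[i])
        var_j = sum((b - means[j]) ** 2 for b in columns[j])
        if var_i == 0 or var_j == 0:
            continue
        total += abs(cov / sqrt(var_i * var_j))
        pairs += 1
    return total / pairs if pairs else 0.0


query = binarize([0.7, -1.2, 0.1, -0.3])      # bits 1010
candidate = binarize([0.5, 0.9, -0.4, -0.8])  # bits 1100
print(hamming_distance(query, candidate))      # 2
print(storage_bytes(10**9, 128) / 10**9)       # 16.0 (decimal GB)
```

A production system would pack 128-bit codes into machine words and use a hardware popcount instruction; the pure-Python version only illustrates the arithmetic. A mean absolute correlation near 1 means the bits largely repeat one another, which is the redundancy the abstract describes.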

Similar resources

Deep Class-Wise Hashing: Semantics-Preserving Hashing via Class-wise Loss

Deep supervised hashing has emerged as an influential solution to large-scale semantic image retrieval problems in computer vision. In the light of recent progress, convolutional neural network based hashing methods typically seek pair-wise or triplet labels to conduct the similarity preserving learning. However, complex semantic concepts of visual contents are hard to capture by similar/dissim...

Deep Discrete Supervised Hashing

Hashing has been widely used for large-scale search due to its low storage cost and fast query speed. By using supervised information, supervised hashing can significantly outperform unsupervised hashing. Recently, discrete supervised hashing and deep hashing are two representative progresses in supervised hashing. On one hand, hashing is essentially a discrete optimization problem. Hence, util...

Asymmetric Deep Supervised Hashing

Hashing has been widely used for large-scale approximate nearest neighbor search because of its storage and search efficiency. Recent work has found that deep supervised hashing can significantly outperform non-deep supervised hashing in many applications. However, most existing deep supervised hashing methods adopt a symmetric strategy to learn one deep hash function for both query points and ...

Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval

Thanks to the success of deep learning, cross-modal retrieval has made significant progress recently. However, there still remains a crucial bottleneck: how to bridge the modality gap to further enhance the retrieval accuracy. In this paper, we propose a self-supervised adversarial hashing (SSAH) approach, which lies among the early attempts to incorporate adversarial learning into cross-modal ...

SSDH: Semi-supervised Deep Hashing for Large Scale Image Retrieval

The hashing methods have been widely used for efficient similarity retrieval on large scale image datasets. The traditional hashing methods learn hash functions to generate binary codes from hand-crafted features, which achieve limited accuracy since the hand-crafted features cannot optimally represent the image content and preserve the semantic similarity. Recently, several deep hashing method...

Publication date: 2017